Application
This unit describes the skills and knowledge required to cluster data extracts from big data following unsupervised machine learning methodologies and report on the findings.
It applies to individuals who work in roles including data analysts, data scientists, machine learning engineers, developers and programmers, and who are responsible for data mining and machine learning activities with big data within medium to large organisations.
No licensing, legislative or certification requirements apply to this unit at the time of publication.
Elements and Performance Criteria
1. Determine data clustering requirements | 1.1 Research organisation’s need for data clustering and define problem, objective and outputs 1.2 Determine required machine and input data set according to task requirements 1.3 Define evaluation protocol and accepted measure of success 1.4 Develop and document required benchmark model |
2. Prepare data | 2.1 Collect data according to task requirements 2.2 Evaluate data quantity, completeness and alignment according to task requirements 2.3 Transform and format data according to specifications 2.4 Finalise data preparation according to task requirements |
3. Cluster data | 3.1 Input raw data according to task requirements 3.2 Run required algorithm and adhere to required processing time frame 3.3 Obtain output reports and determine completeness of task according to requirements |
4. Finalise data clustering tasks | 4.1 Analyse data report and determine clustering tasks have been completed according to task requirements 4.2 Interpret, summarise and document findings 4.3 Communicate findings to required personnel and seek and respond to feedback 4.4 Lodge documentation according to task requirements and finalise task activities according to organisational requirements |
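As an illustration only, the preparation, clustering and benchmarking steps in elements 1 to 3 could be sketched as follows. This is a minimal sketch, not a prescribed method: it assumes k-means as the clustering algorithm, within-cluster sum of squares as the measure of success, and a single-cluster partition as the benchmark model. The data extract and its column meanings are hypothetical.

```python
import math
import random


def standardise(rows):
    """Rescale each column to zero mean and unit variance (data preparation)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Column-wise mean of a non-empty list of points."""
    return [sum(c) / len(points) for c in zip(*points)]


def kmeans(rows, k, seed=0, max_iter=100):
    """Plain Lloyd's k-means (an assumed algorithm choice); returns (labels, centres)."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j]))
                      for p in rows]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(rows, labels) if lab == j]
            if members:  # keep the previous centre if a cluster empties
                centres[j] = centroid(members)
    return labels, centres


def wcss(rows, labels, centres):
    """Within-cluster sum of squares: lower means tighter clusters."""
    return sum(sq_dist(p, centres[lab]) for p, lab in zip(rows, labels))


# Hypothetical extract: (monthly_spend, visits_per_month) per customer.
extract = [[20, 1], [22, 2], [25, 1], [200, 12], [210, 15], [190, 11]]
data = standardise(extract)

# Benchmark model (assumed): every record in one cluster.
baseline = wcss(data, [0] * len(data), [centroid(data)])

labels, centres = kmeans(data, k=2)
score = wcss(data, labels, centres)
print(f"benchmark WCSS={baseline:.3f}, k-means WCSS={score:.3f}, labels={labels}")
```

Comparing the clustered score against the benchmark gives one concrete, documentable measure of success; an organisation's actual evaluation protocol may specify a different algorithm or metric.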
Evidence of Performance
The candidate must demonstrate the ability to complete the tasks outlined in the elements, performance criteria and foundation skills of this unit, including evidence of the ability to:
collect, prepare and cluster data using unsupervised machine learning methodologies and report on the findings on at least two occasions.
In the course of the above, the candidate must:
research industry standard approaches and methodologies for machine learning
evaluate and prepare data.
Evidence of Knowledge
The candidate must be able to demonstrate knowledge to complete the tasks outlined in the elements, performance criteria and foundation skills of this unit, including knowledge of:
methodologies for clustering unlabelled data including intra-cluster cohesion and inter-cluster separation
industry standard data clustering methodologies including benchmark modelling techniques for data clustering
report writing methodologies relevant to reporting findings of data clustering activities
industry standard machine learning methodologies relevant to unsupervised learning
methodologies for modelling data relevant to unsupervised learning.
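One common way to quantify cohesion within clusters and separation between clusters is the silhouette coefficient. The sketch below is a minimal pure-Python version under stated assumptions: Euclidean distance, a score of 0 for singleton clusters by convention, and hypothetical sample points. It is offered as an example of the concept, not as the required evaluation measure.

```python
import math


def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette(points, labels):
    """Mean silhouette coefficient over all points, in the range [-1, 1]."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for a cluster with one member
            continue
        # Cohesion: mean distance from p to the other members of its own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # Separation: mean distance from p to the nearest other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical points forming two tight, well-separated groups.
points = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette(points, [0, 0, 1, 1])  # labels follow the natural groups
poor = silhouette(points, [0, 1, 0, 1])  # labels cut across the groups
print(f"good labelling: {good:.3f}, poor labelling: {poor:.3f}")
```

Values near 1 indicate tight, well-separated clusters; values near or below 0 suggest that points sit closer to another cluster than to their own.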
Assessment Conditions
Assessment must be conducted in a safe environment where evidence gathered demonstrates consistent performance of typical activities experienced in the data analytics field of work and include access to:
hardware, software and components required for using unsupervised learning for clustering
organisational data reporting style guide and reporting processes required for unsupervised learning and machine learning
a site where activities can be carried out
data required for clustering.
Assessors of this unit must satisfy the requirements for assessors in applicable vocational education and training legislation, frameworks and/or standards.
Foundation Skills
Numeracy | Uses mathematical formulae to calculate required measurements, determine values and articulate numerical findings |
Oral communication | Uses listening and questioning techniques to seek and respond to feedback |
Reading | Analyses technical, manufacturer and organisational documentation to determine and confirm job requirements |
Writing | Prepares complex documentation detailing benchmark model and findings using relevant language to convey explicit information, requirements and recommendations |
Planning and organising | Uses formal, logical planning processes together with an increasingly intuitive understanding of context |
Problem solving | Uses nuanced understanding of context to recognise anomalies and subtle deviations from normal expectations, focusing attention and remedying problems as they arise |
Self-management | Takes full responsibility for identifying and considering relevant organisational protocols and requirements. Uses systematic processes, setting goals, gathering required information and identifying and evaluating options against agreed criteria |
Technology | Identifies principles, concepts, language and practices associated with the digital world |
Sectors
Data analytics